Conversation

@vincentsarago
Member

@vincentsarago vincentsarago commented Oct 29, 2025

ref #1247
closes #1084

This PR does:

  • create a new opener function using Obstore and Zarr (for Zarr datasets only)
  • switch the default reader to the open_zarr opener
  • add zarr-python and Obstore as dependencies
  • create a default application (for Zarr)
  • remove Python 3.10 support

ToDo

  • add tests for the default application
  • update documentation
  • add an example of an application using the fsspec opener

      """test reader."""
      src_path = protocol + os.path.join(protocol, prefix, filename)
-     with Reader(src_path, variable="dataset") as src:
+     with Reader(src_path, variable="dataset", opener=fs_open_dataset) as src:
Member Author

Reader(src_path, opener=fs_open_dataset, ...) is the same as titiler.xarray.io.FsReader


# Fallback to Zarr
else:
store = zarr.storage.FsspecStore.from_url(
Member Author

removed support for zarr-python 2.0

  strategy:
    matrix:
-     python-version: ['3.10', '3.11', '3.12', '3.13']
+     python-version: ['3.11', '3.12', '3.13']
Member Author

Because titiler.xarray cannot support Python 3.10, it's easier to switch all packages to >=3.11 as well.

@vincentsarago vincentsarago requested review from hrodmn and maxrjones and removed request for hrodmn October 29, 2025 13:03
Contributor

@hrodmn hrodmn left a comment

Thanks for adding titiler.xarray.main! I ran uv run uvicorn titiler.xarray.main:app and it worked great out of the box, but I did have to set AWS_SKIP_SIGNATURE=True to get anonymous reads by default. I wonder how we might configure that type of setting at the application level.

Comment on lines 75 to 81
if "region" not in kwargs and infer_region:
# infer region or fallback to env variables
region_name_env = (
os.environ.get("AWS_REGION", os.environ.get("AWS_DEFAULT_REGION"))
or None
)
else:
store = fsspec.filesystem(protocol).get_mapper(src_path)
config["region"] = _find_bucket_region(parsed.netloc) or region_name_env
Contributor

maybe throw an http exception if region can't be discovered
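
To illustrate the suggestion above, here is a minimal sketch of a region lookup that fails loudly when nothing can be discovered. It is not the PR's code: `resolve_region` and `RegionNotFoundError` are hypothetical names, and `find_bucket_region` is passed in as a plain callable standing in for the `_find_bucket_region` helper shown in the diff. Only the environment-variable fallback order is taken from the snippet under review.

```python
import os
from typing import Callable, Optional


class RegionNotFoundError(Exception):
    """Hypothetical error raised when no AWS region can be determined."""


def resolve_region(
    bucket: str,
    find_bucket_region: Callable[[str], Optional[str]],
) -> str:
    """Return the bucket region, falling back to environment variables."""
    # Same fallback order as the reviewed snippet:
    # AWS_REGION first, then AWS_DEFAULT_REGION.
    region_env = (
        os.environ.get("AWS_REGION", os.environ.get("AWS_DEFAULT_REGION")) or None
    )
    region = find_bucket_region(bucket) or region_env
    if region is None:
        # An application layer could translate this into an HTTP error response.
        raise RegionNotFoundError(
            f"Could not determine the AWS region for bucket {bucket!r}"
        )
    return region
```

Raising a dedicated exception keeps the opener free of web-framework imports; the application could map it to whichever HTTP status it prefers.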

fs = fsspec.filesystem(protocol, **kwargs)
ds = xarray.open_dataset(fs.open(src_path), **xr_open_args)

# Fallback to Zarr
Contributor

would it make sense to take zarr support out of this opener and have the application layer choose xarray_open_dataset or open_zarr? maybe it is nice to make this one reader that can open anything though.

Member Author

In theory you could, but I think for compatibility reasons it's better to keep everything separate.

> maybe it is nice to make this one reader that can open anything though.

Users could do that themselves if they need to.

Contributor

❤️

@vincentsarago
Member Author

> Thanks for adding titiler.xarray.main! I ran uv run uvicorn titiler.xarray.main:app and it worked great out of the box, but I did have to set AWS_SKIP_SIGNATURE=True to get anonymous reads by default. I wonder how we might configure that type of setting at the application level.

That's a major limitation of Obstore: it doesn't guess the AWS region or whether the data is public (ref: developmentseed/obstore#321)

We already have a hack to guess the region; maybe we could implement something to guess whether we need to set skip_signature 🤷

@hrodmn
Contributor

hrodmn commented Oct 30, 2025

> We already have a hack to guess the region; maybe we could implement something to guess whether we need to set skip_signature 🤷

It isn't pretty, but could we add anonymous and region as optional args in the tile endpoints, then add an endpoint like /storage that would do the guess and check to find out those parameters? It doesn't seem right to have to do the same guess and check for each tile request.

Actually, if we add anonymous and region as query parameters and return informative 400 errors to the user, we could prompt them to set anonymous=True and region={region} depending on the error we get back from obstore.

@vincentsarago
Member Author

open_zarr has an LRU cache, so ideally it shouldn't be called too many times for a given Zarr dataset.

We could add region as a query parameter, but anonymous is a bit different. If the environment has valid AWS credentials, it will work fine on both public and private datasets; skip_signature=True is only useful if there are no credentials in the environment.
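
As a rough illustration of the caching point above, the sketch below uses `functools.lru_cache` on a stand-in opener. `open_dataset_cached` and the `calls` list are hypothetical and only mimic the idea; this is not the `open_zarr` implementation from the PR.

```python
from functools import lru_cache

calls = []  # records every real "open" so cache hits are visible


@lru_cache(maxsize=128)
def open_dataset_cached(src_path: str) -> dict:
    """Stand-in for an expensive opener (NOT titiler.xarray's open_zarr)."""
    calls.append(src_path)
    return {"path": src_path}


first = open_dataset_cached("s3://bucket/a.zarr")
second = open_dataset_cached("s3://bucket/a.zarr")  # served from the cache
other = open_dataset_cached("s3://bucket/b.zarr")  # different key, opened again
```

Note that `lru_cache` keys on the call arguments, so any extra option passed to the opener (a region, for instance) would become part of the cache key and must be hashable.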

@vincentsarago
Member Author

OK, I've changed some of the logic.

As mentioned before, s3:// URLs will be considered private, so S3 credentials should be present in the environment.

If users want to access public S3 objects, they need to use an https:// URL (we will set skip_signature=True).
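
The rule described above could be sketched as follows. This is an assumption-laden illustration rather than the PR's code: `store_config_for` is a hypothetical helper, and the returned dictionary only mirrors the `skip_signature` option named in this thread.

```python
from urllib.parse import urlparse


def store_config_for(src_path: str) -> dict:
    """Illustrative only: choose store options from the URL scheme."""
    scheme = urlparse(src_path).scheme
    if scheme == "https":
        # Treated as a public object: do not sign requests.
        return {"skip_signature": True}
    # s3:// (and anything else) is treated as private: requests stay signed,
    # so credentials are expected in the environment.
    return {}
```

For example, `store_config_for("s3://bucket/data.zarr")` yields an empty config, while the same object addressed through an https:// URL gets `skip_signature=True`.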

if not expr.groupdict().get("region"):
config["region"] = (
_find_bucket_region(bucket) or region_name_env
)
Member Author

This is to support the old-style URL https://{bucket}.s3.amazonaws.com/...
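
A sketch of how such URLs might be parsed, assuming a regular expression with an optional `region` group. The pattern below is a guess for illustration and not the expression used in the PR; it is also simplified (bucket names containing dots are not handled). `bucket_and_region` and its `fallback` argument are hypothetical.

```python
import re
from typing import Callable, Optional, Tuple

# Matches virtual-hosted-style S3 hosts, with or without a region:
#   {bucket}.s3.amazonaws.com
#   {bucket}.s3.{region}.amazonaws.com
#   {bucket}.s3-{region}.amazonaws.com
S3_HOST = re.compile(
    r"^(?P<bucket>[^.]+)\.s3(?:[.-](?P<region>[a-z0-9-]+))?\.amazonaws\.com$"
)


def bucket_and_region(
    host: str,
    fallback: Callable[[str], Optional[str]],
) -> Tuple[str, Optional[str]]:
    """Extract the bucket and region from an S3 host name."""
    expr = S3_HOST.match(host)
    if expr is None:
        raise ValueError(f"Not a recognised S3 host: {host!r}")
    bucket = expr.group("bucket")
    region = expr.groupdict().get("region")
    if not region:
        # Old-style URL without a region: defer to a lookup or env fallback.
        region = fallback(bucket)
    return bucket, region
```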

@vincentsarago vincentsarago requested a review from hrodmn October 30, 2025 14:17
@maxrjones
Member

FYI this type of wrapper - virtual-zarr/obspec-utils#1 - provides a way to use xarray + obstore with libraries that expect file handles (e.g., netcdf4/h5netcdf). That would help if the goal is just to limit dependencies, but wouldn't help if the goal is to simplify code complexity by removing support for data formats other than zarr.

Development

Successfully merging this pull request may close these issues.

[titiler.xarray] Use obstore to access remote datasets in xarray_open_dataset

5 participants